Dynamic  Programming  Algorithms 
for  Planning  and  Robotics 
in  Continuous  Domains 
and  the  Hamilton-Jacobi  Equation 


Ian  Mitchell 

Department  of  Computer  Science 
University  of  British  Columbia 


research  supported  by 

the  Natural  Science  and  Engineering  Research  Council  of  Canada 
and  Office  of  Naval  Research  under  MURI  contract  N0001 4-02-1 -0720 


Report  Documentation  Page 

Form  Approved 

0MB  No.  0704-0188 

Public  reporting  burden  for  the  collection  of  information  is  estimated  to  average  1  hour  per  response,  including  the  time  for  reviewing  instructions,  searching  existing  data  sources,  gathering  and 
maintaining  the  data  needed,  and  completing  and  reviewing  the  collection  of  information.  Send  comments  regarding  this  burden  estimate  or  any  other  aspect  of  this  collection  of  information, 
including  suggestions  for  reducing  this  burden,  to  Washington  Headquarters  Services,  Directorate  for  Information  Operations  and  Reports,  1215  Jefferson  Davis  Highway,  Suite  1204,  Arlington 

VA  22202-4302.  Respondents  should  be  aware  that  notwithstanding  any  other  provision  of  law,  no  person  shall  be  subject  to  a  penalty  for  failing  to  comply  with  a  collection  of  information  if  it 
does  not  display  a  currently  valid  0MB  control  number. 

1.  REPORT  DATE 

22  SEP  2008  TYPE 

3.  DATES  COVERED 

00-00-2008  to  00-00-2008 

4.  TITLE  AND  SUBTITLE 

Dynamic  Programming  Algorithms  for  Planning  and  Robotics  in 
Continuous  Domains  and  the  Hamilton-Jacobi  Equation 

5a.  CONTRACT  NUMBER 

5b.  GRANT  NUMBER 

5c.  PROGRAM  ELEMENT  NUMBER 

6.  AUTHOR(S) 

5d.  PROJECT  NUMBER 

5e.  TASK  NUMBER 

5f.  WORK  UNIT  NUMBER 

7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

University  of  British  Columbia, Department  of  Computer  Science, 2366 
Main  Mall, Vancouver,  BC,  Canada  V6T  1Z4, 

8.  PEREORMING  ORGANIZATION 

REPORT  NUMBER 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

10.  SPONSOR/MONITOR’S  ACRONYM(S) 

11.  SPONSOR/MONITOR’S  REPORT 
NUMBER(S) 

12.  DISTRIBUTION/AVAILABILITY  STATEMENT 

Approved  for  public  release;  distribution  unlimited 

13.  SUPPLEMENTARY  NOTES 

14.  ABSTRACT 

15.  SUBJECT  TERMS 

16.  SECURITY  CLASSIEICATION  OE:  17.  LIMITATION  OE 

ARSTRAUT 

18.  NUMBER  19a.  NAME  OE 

OE  PAGES  RESPONSIBLE  PERSON 

a.  REPORT  b.  ABSTRACT  c.  THIS  PAGE  Same  aS 

unclassified  unclassified  unclassified  Report  (SAR) 

72 

Standard  Form  298  (Rev.  8-98) 

Prescribed  by  ANSI  Std  Z39-18 


Outline 


Cost  Map 


Introduction 

-  Optimal  control 

-  Dynamic  programming  (DP) 

Path  Planning 

-  Discrete  planning  as  optimal  control 

-  Dijkstra’s  algorithm  &  its  problems 

-  Continuous  DP  &  the  Hamilton- 
Jacobi  (HJ)  PDE 

-  The  fast  marching  method  (FMM): 
Dijkstra’s  for  continuous  spaces 

Algorithms  for  Static  HJ  PDEs 

-  Four  alternatives 


-0.8  -0.6  -0.4  -0.2 


-  FMM  pros  &  cons 
•  Generalizations 

-  Alternative  action  norms 

-  Multiple  objective  planning 


22  Sept  2008 


Ian  Mitchell,  University  of  British  Columbia 


0  50  100  150  200  250  300  350  400 

x(knn) 


Basic  Path  Planning 

•  Find  the  optimal  path  p{s)  to  a  target  (or  from  a  source) 

•  Inputs 

-  Cost  c(x)  to  pass  through  each  state  in  the  state  space 

-  Set  of  targets  or  sources  (provides  boundary  conditions) 


cost  map  (higher  is  more  costly) 


Cost  Function  Map 
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Discrete  vs  Continuous 

•  Discrete  variable 

-  Drawn  from  a  countable  domain,  typically  finite 

-  Often  no  useful  metric  other  than  the  discrete  metric 

-  Often  no  consistent  ordering 

-  Examples:  names  of  students  in  this  room,  rooms  in  this 
building,  natural  numbers,  grid  of  M'',  ... 

•  Continuous  variable 

-  Drawn  from  an  uncountable  domain,  but  may  be  bounded 

-  Usually  has  a  continuous  metric 

-  Often  no  consistent  ordering 

-  Examples:  Real  numbers  [  0,  1  ],  M'',  SO(3),  ... 
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Classes  of  Models  for  Dynamic  Systems 


Discrete  time  and  state 

Continuous  time  /  discrete  state 

-  Discrete  event  systems 

Discrete  time  /  continuous  state 

Continuous  time  and  state 

Markovian  assumption 

-  All  information  relevant  to  future 
evolution  is  captured  in  the  state  variable 

-  Vital  assumption,  but  failures  are  often 
treated  as  nondeterminism 

Deterministic  assumption 

-  Future  evolution  completely  determined 
by  initial  conditions 

-  Can  be  eased  in  many  cases 

Not  the  only  classes  of  models 


x(t  +  1)  = 
where  a;  G  {. . .}  = 


x{t  +  1)  =  f{t,x{t)) 
where  a;  G  =  S 


where  x(t)  g  =  S 
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Achieving  Desired  Behaviours 

•  We  can  attempt  to  control  a  system  when  there  is  a  parameter  u 
of  the  dynamics  (the  “control  input”)  which  we  can  influence 

x{t  +  1)  =  f{x{t),u{t))  or 
X  =  f(x(t),u(t)) 
for  x{t)  E  E  U{x{t)) 

-  Time  dependent  dynamics  are  possible,  but  we  will  mostly  deal  with 
time  invariant  systems 

•  Without  a  control  signal  specification,  system  is  nondeterministic 

-  Current  state  cannot  predict  unique  future  evolution 

•  Control  signal  may  be  specified 

-  Open-loop  u(t)  orw.R^U 

-  Feedback,  closed-loop  u(x(t)) 

-  Either  choice  makes  the  system  deterministic  again 
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Objective  Function 


•  We  distinguish  quality  of  control  by  an  objective  /  payoff  /  cost 
function,  which  comes  in  many  different  variations 
-  eg:  discrete  time  discounted  with  fixed  finite  horizon 

tf-i 

u(t))  +  ct^f 

t=to 

where  x(-)  solves  x{t  +  1)  =  and  x{to)  — 


J(to,xo,u(-))  =  22 


-  eg:  continuous  time  no  discount  with  target  set  T 

J(xo,  u(-))  =  f  ^  +  gf(x(tf)) 

Jto 

where  x(-)  solves  x{t)  —  f{x{t),u{t))  and  x(to)  =  xq 
and  tf  —  min{i  |  x{t)  G  T} 
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Value  Function 


•  Choose  input  signal  to  optimize  the  objective 

-  Optimize:  “cost”  is  usually  minimized,  “payoff”  is  usually  maximized 
and  “objective”  may  be  either 

•  Value  function  is  the  optimal  value  of  the  objective  function 


^(*0)2^0) 


V{xq) 


min  J(to,a;o,u(-)) 
u{-)eu 


min 


a*  ^^g{x{t),u{t)) a^f  ^°gf(x(tf)) 


or 


'f 


"  9(x(t),u(t))  +gf(x(tf)) 
u{-)eUJto  •'  •' 


-  May  not  be  achieved  for  any  signal 

-  Set  of  signals  U  can  be  an  issue  in  continuous  time  problems  (eg 
piecewise  constant  vs  measurable) 
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Dynamic  Programming  in  Discrete  Time 

•  Consider  finite  horizon  objective  with  a  =  1  (no  discount) 

let  x(-)  solve  x{t  +  1)  =  and  x(to)  =  xq 

J(to,xo,u(-))  =  +  gf(x(tf)) 

t=to 

•  So  given  u{-)  we  can  solve  inductively  backwards  in  time  for 
objective  J(t,  x,  u(-)),  starting  a\t  =  tf 

J(tf,x(tf),u(-))  =  gf(x(tf)) 

J {ti,  x(ti) ,  u{-))  =  g(x{ti),u(ti))  +  J(tj+i,/(a;(tj),u(tj)),ii(-)) 
-  Called  dynamic  programming  (DP) 
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DP  for  the  Value  Function 


DP  can  also  be  applied  to  the  value  function 

-  Second  step  works  because  u(tQ)  can  be  chosen  independently  of 
u(t)  for  t  >tQ 

tf-i 

^(io>a:o)  =  min 

t=to 

—  min  (gix(tQ),u(tQ))  +  J(to  +  1,  a;(to  +  1); 
«(■) 


—  min  g(x(to),u')  +  V  {tQ  +  l,x{tQ  +  1)) 

(Jj 

=  min5(a;(to),'u)  +  ^ ,  f  ,  u(to))) 

(JJ 


V(tf,x(tf))  =gf{x{tf)) 
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Optimal  Control  via  DP 

•  Optimal  control  signal 

u*{t,  x(t))  =  argmin  [g{x{t),  u)  V (t  1 ,  f{x(t),  u))] 

U 

•  Optimal  trajectory  (discrete  gradient  descent) 

x*(t+  1)  =  f  (x* (t) ,  u* (t)) 

=  argmin  V{t -\- l,x{t -\- 1)) 

x(t+l)=/(x*(t),M) 

•  Observe  update  equation 

AV(t,x)  =  V(t  +  l,x(t  +  l))  -  V{t,x) 

^  -m\ng{x,t,u) 

(Jj 

•  Can  be  extended  (with  appropriate  care)  to 

-  other  objectives 

-  probabilistic  models 

-  adversarial  models 
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Basic  Path  Planning  (reminder) 

•  Find  the  optimal  path  p{s)  to  a  target  (or  from  a  source) 

•  Inputs 

-  Cost  c(x)  to  pass  through  each  state  in  the  state  space 

-  Set  of  targets  or  sources  (provides  boundary  conditions) 


cost  map  (higher  is  more  costly) 


Cost  Function  Map 
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Cost  Map 


22  Sept  2008 


Ian  Mitchell,  University  of  British  Columbia 


13 


Discrete  Planning  as  Optimal  Control 

•  Dynamics  x{t  +  1)  =  u  where  u  e 

•  Cost  to  reach  target  T  is 

T 

J{xo,u{-))  =  ^  l{x{t),u{t)) 

t=0 

where 

i{x,  u)  =  c{x)  and  T  =  min{t  >  0  |  x(t)  e  T} 

•  Value  function  (min  cost  to  target) 

'd(xo)  =  min  J(xo,ix(-)) 

u(-) 

•  Value  function  solves  recursion 

^(x)  =  min  [i9(y)  +  c{x)]  for  x 

yeN(x) 

'd^x)  =  0  for  X  G  T 
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Dynamic  Programming  Principle 

i9{x)  —  min  [&{y)  +  c(x)] 

y^N{x) 


•  Value  function  is  “cost  to  go”  from  x  to  the  nearest  target 

•  Value  7^x)  at  a  point  x  is  the  minimum  over  all  points  y  in  the 
neighborhood  N{x)  of  the  sum  of 

-  the  value  i^y)  at  point  y 

-  the  cost  c{x)  to  travel  through  x 

•  Dynamic  programming  applies  if 

-  Costs  are  additive 

-  Subsets  of  feasible  paths  are  themselves  feasible 

-  Concatenations  of  feasible  paths  are  feasible 

•  Compute  solution  by  value  iteration 

-  Repeatedly  solve  DP  equation  until  solution  stops  changing 

-  In  many  situations,  smart  ordering  reduces  number  of  iterations 
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Policy  (Feedback  Control) 

•  Given  value  function  t9(x),  optimal  action  at  a;  is  a;  -s-  y  where 
y  =  argmin  {y  G  N(x)  \  'diy)  +  c{x)} 

-  Policy  u{x)  =  y 


•  Alternative  policy  iteration  constructs  policy  directly 

-  Finite  termination  of  policy  iteration  can  be  proved  for  some 
situations  where  value  iteration  does  not  terminate 

-  Representation  of  policy  function  may  be  more  complicated  than 
value  function 
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Dijkstra’s  Algorithm  for  the  Value  Function 

•  Single  pass  dynamic  programming  value  iteration  on  a  discrete 
graph 

1 .  Set  all  interior  nodes  to  a  dummy  value  infinity  oo 

2.  For  all  boundary  nodes  x  and  all  y  e  N(x)  approximate  j^y)  by 
DPP 

3.  Sort  all  interior  nodes  with  finite  values  in  a  list 

4.  Pop  node  x  with  minimum  value  from  the  list  and  update  79(y)  by 
DPP  for  all  y  e  N(x) 
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Generic  Dijkstra-like  Algorithm 

foreach  xi  e  Qn\T  do  'd{xi)  < — \-oo 

foreach  e  T  do  ^  0 

Q  On 

while  Q  7^  0  do 

Xi  ExtractMin(Q) 
foreach  Xj  g  Mn{xi)  do 
'&(xj)  Update(xj, A/’nCiCj), 'j9, c) 

•  Could  also  use  iterative  scheme  by  minor  modifications  in 
management  of  the  queue 
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Typical  Discrete  Update 


•  Much  better  results  from 
discrete  Dijkstra  with  eight 
neighbour  stencil 

•  Result  still  shows  facets  in 
what  should  be  circular 
contours 


black:  value  function  contours 
for  minimum  time  to  the  origin 
red:  a  few  optimal  paths 


Update(a:j,  N{xj),  V,  c) 


min 


c(xj)  -\-V(xk) 
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Other  Issues 


Values  and  actions  are  not 
defined  for  states  that  are 
not  nodes  in  the  discrete 
graph 

Actions  only  include  those 
corresponding  to  edges 
leading  to  neighboring  states 

Interpolation  of  actions  to 
points  that  are  not  grid 
nodes  may  not  lead  to 
actions  optimal  under 
continuous  constraint 


two  optimal  paths  to  the 
lower  right  node 
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Deriving  Continuous  DP  (Informally) 


Discrete  dynamic  programming  principle 

=  min  +  c{x  y)] 

yeN(x) 

Continuous  DPP  for  path  p(-) 


'd(p(s))  =  min 

KO 

Rearrange 


+  As))  +  f  c{p{a))da 
J  s 


mm 

pO) 


^(p(^))  ~  d{p{s  +  As))  /s  c{p{a))da 


As 

Take  limit  As  ^  0 

dd^p^s)) 


As 


=  0 


mm 

pO) 


ds 


-  c(p(s)) 


=  0 
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The  Static  Hamilton-Jacobi  PDE 


After  limit  As  ^  0 


min 

K-) 


d'd^p^s)) 

ds 


+  c(p(s)) 


=  0 


Set  X  =p(s)  and  apply  chain  rule 


min 

K-) 


dd^x) dp{s) 
dx  ds 


+  c(x) 


•  Let  control  be  u(s)  =  and  observe 

that  only  dependence  on  p(-)  is  u 


rnin  [Dxd{x)  •  u  +  c{x)]  =  H{x,  Dxd{x))  =  0 

•  From  original  problem,  we  get  boundary 
conditions 


d(x)  =  0  for  X  E  dT 
•  Note:  a  very  informal  derivation 

Ian  Mitchell,  University  of  British  Columbia 
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Continuous  Planning  as  Optimal  Control 

•  Dynamics  x  =  u,  x  G  and  \\u\\2  <  1 

•  Cost  to  reach  target  is 


fT 

=  /  i{x{s),u{s))  ds 

Jo 

where 


i{x,  u)  =  c{x)  and  T  =  min{t  >  0  |  x(t)  e  T} 
•  Value  function  (min  cost  to  target) 


'dixo)  =  inf  J{xQ,u{-)) 

u(-) 


Value  function  solves  HJ  PDE  (\A/e  choose 


optimal  control  u(  )  = 


Dx’&jx) 


Dxnx}\\ 


) 


Dx'd{x)\\2  =  c(x)  for  X  G  \  T 
'd(x)  =  0  for  X  G  dT 
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Path  Generation 


•  Optimal  path  p{s)  is  found  by  gradient  descent 

-  Value  function  ^x)  has  no  local  minima,  so  paths  will  always 
terminate  at  a  target 


dp{s) 

ds 


Dxdjx) 

Dxd{x)\\ 


Value  Function  with  some  Sample  Paths 


Value  Function  with  some  Sample  Paths 


Allowing  for  Continuous  Action  Choice 


•  Fast  Marching  Method  (FMM): 
Dijkstra’s  algorithm  adapted  to 
a  continuous  state  space 

•  Dijkstra’s  algorithm  is  used  to 
determine  the  order  in  which 
nodes  are  visited 

•  When  computing  the  update  for 
a  node,  examine  neighboring 
simplices  instead  of 
neighboring  nodes 

•  Optimal  path  may  cross  faces 
or  interior  of  any  neighbor 
simplex 


Input:  xo,Af(xo),'d,c 
Output:  V(xo) 
foreach  S  e  N'si^o)  do 
Compute 

return 
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Solution  on  a  Simplex  (Finite  Difference) 


We  wish  to  solve 

Dx'&(x)  2  =  c(a;) 


'i?2  —  '^(^2) 


•'dl  = 


Construct  finite  difference  approximation 


Then  rearrange  to  find 

^  (^1  +  'i?2  +  \/2Ax^c(xo)^  -  (i?!  - 
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Solution  on  a  Simplex  (Semi-Lagrangian) 

•  We  wish  to  find  the  optimal  path  across  the  simplex 

•  Approximate  cost  of  travel  across  the  simplex  as  constant  c(xo) 

•  Approximate  cost  to  go  from  far  edge  of  simplex  as  linear 
interpolation  along  the  edge 

•  Optimization  can  be  solved  analytically:  leads  to  the  same 
solution  as  the  finite  difference  approximation 


X  —  A^i  -|-  (1  —  y)x2 

i9(x)  —  At?!  +  (1  —  A)'j92 

i9q  —  min  'i?(£)  +  c(x) 
Ae[o,i] 


X  —  Xq 


792 
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How  Do  the  Paths  Compare? 

•  Solid:  eight  neighbor  discrete  Dijkstra 

•  Dashed:  Fast  Marching  on  Cartesian  grid 
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FMM  for  Robot  Path  Planning 


Find  shortest  path  to  objective  while 
avoiding  obstacles 

-  Obstacle  maps  from  laser  scanner 

-  Configuration  space  accounts  for 
robot  shape 

-  Cost  function  essentially  binary 

Value  function  measures  cost  to  go 

-  Solution  of  Eikonal  equation 

-  Gradient  determines  optimal  control 


typical  laser  scan  with 
configuration  space  obstacles 
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Continuous  Value  Function  Approximation 

•  Contours  are  value  function 

-  Constant  unit  cost  in  free  space,  very  high  cost  near  obstacles 

•  Gradient  descent  to  generate  the  path 
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Comparing  the  Paths  with  Obstacles 

•  Value  function  from  discrete  Dijkstra  shows  faceting 

-  Paths  tend  to  follow  graph  edges  even  with  action  interpolation 

•  Value  function  from  fast  marching  is  smoother 

-  Can  still  have  large  errors  on  large  simplices  or  near  target 


discrete  Dijkstra’s  algorithm  (8  neighbors)  continuous  fast  marching  method 
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-  FMM  pros  &  cons 
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-  Alternative  action  norms 
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DP  leads  to  Hamilton-Jacobi  Equations 

•  Difrerent  cost  functionals  lead  to  different 
types  of  Hamilton-Jacobi  equation 

•  Finite  Horizon:  fixed  T 

(/)(a:(t),  f)  =  inf  /  ds 

u(-)  [Jt 

solves  for  x  g  and  (j)(x,T)  =  g(x) 


Dt(f){x,  t)-|-min  [f{x,  u)  •  Dxcl){x,  t)  +  i{x,  u)]  =  0 

ueu 


•  Target  Set:  T  =  min{t  >  0 


x(t)  G  T} 


cT 

^(^o)  =  /  £{x{s),u{s))  ds 

u(-)  Jo 

solves  for  a;  g  \  T  and  d(dT)  =  0 


min  [f(x,  u)  •  Dx^O^xq)  +  i{x,  u)]  =  0 

ueu 
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Hamilton-Jacobi  Flavours 


•  Time-dependent  Hamilton-Jacobi  used  for  dynamic  implicit 
surfaces  and  finite  horizon  optimal  control  /  differential  games 

Dt(j){x,  t)  +  H{x,  Dx(t)(x,  t))  =  0 


-  Solution  continuous  but  not  necessarily  differentiable 

-  Time  stepping  approximation  with  high  order  accurate  schemes 

-  Numerical  schemes  have  conservation  law  analogues 

•  Stationary  (static)  Hamilton-Jacobi  used  for  target  based  cost  to 
go  and  time  to  reach  problems 


Hix,  Dx'Dix))  =  0 


Dx^ix) 


-  Solution  may  be  discontinuous 

-  Many  competing  algorithms,  variety  of  speed  &  accuracy 

-  Numerical  schemes  use  characteristics  (trajectories)  of  solution 
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Solving  Static  HJ  PDEs 

•  Two  methods  available  for  using  time-dependent  techniques  to 
solve  the  static  problem 

-  Iterate  time-dependent  version  until  Hamiltonian  is  zero 

-  Transform  into  a  front  propagation  problem 

•  Schemes  designed  specifically  for  static  HJ  PDEs  are 
essentially  continuous  versions  of  value  iteration  from  dynamic 
programming 

-  Approximate  the  value  at  each  node  in  terms  of  the  values  at  its 
neighbours  (in  a  consistent  manner) 

-  Details  of  this  process  define  the  “local  update” 

-  Eulerian  schemes,  plus  a  variety  of  semi-Lagrangian 

•  Result  is  a  collection  of  coupled  nonlinear  equations  for  the 
values  of  all  nodes  in  terms  of  all  the  other  nodes 

•  Two  value  iteration  methods  for  solving  this  collection  of 
equations:  marching  and  sweeping 

-  Correspond  to  label  setting  and  label  correcting  in  graph  algorithms 
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Convergence  of  Time-Dependent  Version 

H(^x,  Dx'&(x))  =  0  for  X  E  Q.\'T 
i9(x)  =  0  for  X  G  9T 

•  Time-dependent  version:  repiace  i?(a;)  ^  and  add 
temporai  derivative 

x)  +  H{x,  x))  =  0 

-  Solve  until  D^i9{t,x)  =  0,  so  that  i9(t,x)  =  'd{x) 

•  Not  a  good  idea 

-  No  reason  to  believe  that  D^^{t,x)  ^  0  in  general 

-  In  limit  t^oo,  there  is  no  guarantee  that  i^(t,x)  remains 
continuous,  so  numerical  methods  may  fail 
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Transformation  to  Time-Dependent  HJ 

Create  implicit  surface  definition  of  T 

^  0,  X  G  '7'] 

if(x,  0)  <  =  o,x  e  dT : 

0,x  eR^\T. 

Under  assumption  Dx^p{x,0)  -  p  ^  0  on  dT, 
make  change  of  variables 

Dtp{x,t) 

and  get  toolbox  appropriate  PDE 

A„(x,  0  +  min  a:^<£4_£  =  0. 

After  solving,  set 'd  to  be  crossing  time 


'd^x)  =  {t  I  =  0}. 
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Methods:  Time-Dependent  Transform 


Equivalent  to  a  wavefront  propagation 
problem 

Pros: 

-  Implicit  surface  function  for  wavefront  is 
always  continuous 

-  Handles  anisotropy,  nonconvexity 

-  High  order  accuracy  schemes  available  on 
uniform  Cartesian  grid 

-  Subgrid  resolution  of  obstacles  through 
implicit  surface  representation 

-  Can  be  parallelized 

-  ToolboxLS  code  is  available 

(http : //www. cs .ubc . ca/~mitchell/ ToolboxLS ) 

Cons: 

-  CFL  requires  many  timesteps 

-  Computation  over  entire  grid  at  each 
timestep 


expanding  wavefront 
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Methods:  Fast  Marching  (FM) 

•  Dijkstra’s  algorithm  with  a  consistent  node  update  formula 

•  Pros: 

-  Efficient,  single  pass 

-  Isotropic  case  relatively  easy  to  implement 

•  Cons: 

-  Random  memory  access  pattern 

-  No  advantage  from  accurate  initial  guess 

-  Requires  causality  relationship  between  node  values 

-  Anisotropic  case  (Ordered  Upwind  Method)  trickier  to  implement 
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Methods:  Fast  Sweeping  (FS) 


Gauss-Seidel  iteration  through  the  grid 


-  For  a  particular  node,  use  a  consistent 
update  (same  as  fast  marching) 

-  Several  different  node  orderings  are 
used  in  the  hope  of  quickly  propagating 
information  along  characteristics 

Pros: 

-  Easy  to  implement 

-  Predictable  memory  access  pattern 

-  Handles  anisotropy,  nonconvexity, 
obtuse  unstructured  grids 

-  May  benefit  from  accurate  initial  guess 
Cons: 

-  Multiple  sweeps  required  for 
convergence 


sweep  1  sweep  2 


-  Number  of  sweeps  is  problem  dependent 
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Cost  Depends  on... 

•  So  far  assumed  that  cost  depends  only  on  position 

•  More  generally,  cost  could  depend  on  position  and  direction  of 
motion  (eg  action  /  input) 

-  Variable  dependence  on  position:  inhomogenous  cost 

-  Variable  dependence  on  direction:  anisotropic  cost 

•  Discrete  graph 

-  Cost  is  associated  with  edges  instead  of  nodes 

-  Dijkstra’s  algorithm  is  essentially  unchanged 

•  Continuous  space 

-  Static  HJ  PDE  no  longer  reduces  to  the  Eikonal  equation 


min  [Dx'^^x)  •  u  +  c{x)]  =  0 

ueu 


Dx'0(x)\\  =  c(x) 


when  U  is  not  a  circle  /  sphere 


-  Gradient  of  may  not  be  the  optimal  direction  of  motion 
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Interpreting  Isotropic  vs  Anisotropic 

•  For  planar  problems,  cost  can  be  interpreted  as  inverse  of  the 
speed  of  a  robot  at  point  x  and  heading  0=  atan(p2/pi) 

•  General  anisotropic  cost  depends  on  direction  of  motion 

—  P 
i(x,p) 

•  Isotropic  special  case:  robot  moves  in  any  direction  with  equal 
cost 


•  Related  to  but  a  stronger  condition  than 

-  holonomic 

-  small  time  controllable 
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Anisotropy  Leads  to  Causality  Problems 


•  To  compute  the  value  at  a  node,  we  look  back  along  the  optimal 
trajectory  (“characteristic”),  which  may  not  be  the  gradient 

•  Nodes  in  the  simplex  containing  the  characteristic  may  have 
vaiue  greater  than  the  current  node 

-  Under  Dijkstra’s  algorithm,  only  values  less  than  the  current  node 
are  known  to  be  correct 

•  Ordered  upwind  (OUM)  extension  of  FMM  searches  a  iarger  set 
of  simplices  to  find  one  whose  vaiues  are  ali  known 

FMM  uses  i?2  but  ~ 


'd2>  'do  >  '!?! 


5.^  OUM  uses  ^2 

7^0  ^  '^2  ^  '^1 
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Representing  Obstacles 

•  Computational  domain  should  not  include  (hard)  obstacles 

-  Requires  “body-fitted”  and  often  non-acute  grid:  straightforward  in 
2D,  challenging  in  3D,  open  problem  in  4D+ 

•  Alternative  is  to  give  nodes  inside  the  obstacle  a  very  high  cost 

-  Side  effect:  the  obstacle  boundary  is  blurred  by  interpolation 

•  Improved  resolution  around  obstacles  is  possible  with  semi- 
structured  adaptive  meshes 

-  Not  trivial  in  higher  dimensions;  acute  meshes  may  not  be  possible 


0.6  - 

0.4  - 

0.2  - 

0  - 

0.2  - 

0.4  - 

0.6  - 
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Adaptive  Meshing  is  Practically  Important 


•  Much  of  the  static  HJ  literature  involves  only 
2D  and/or  fixed  Cartesian  meshes  with 
square  aspect  ratios 

-  “Extension  to  variably  spaced  or  unstructured 
meshes  is  straightforward...” 

•  Nontrivial  path  planning  demands  adaptive 
meshes 

-  And  C-space  meshing,  and  dynamic  meshing, 
and  ... 


Cartesian  mesh’s  paths  adaptive  mesh 
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FMM  Does  Not  Do  Nondeterminism 

•  Probabilistic 

-  If  stochastic  behavior  is  Brownian,  HJ  PDE  becomes  (degenerate) 
elliptic  (static  HJ)  or  parabolic  (time-dependent  HJ) 

-  Lots  of  theory  available,  but  few  algorithms 

-  Leading  error  terms  in  approximation  schemes  often  behave  like 
dissipation  /  Brownian  motion  in  dynamics 

•  Worst  case  /  robust 

-  Optimal  control  problem  becomes  a  two  player,  zero  sum 
differential  game 

-  Also  called  “robust  optimal  control” 

-  Hamiltonian  is  not  convex  in  D^'d  and  causality  condition  may  fail 

H{x,  Dx'O)  =  max  min  [Dx'd{x)  •  f{x,  u,  d)  +  c(x)] 

deD  ueu 
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other  FMM  Issues 


•  Initial  guess 

-  FMM  gets  little  benefit  from  a  good  initial  guess  because  each 
node’s  value  is  computed  only  when  it  might  be  completely  correct 

-  Changing  the  value  of  any  node  can  potentially  change  any  other 
node  with  a  higher  value,  so  an  efficient  updating  algorithm  is  not 
trivial  to  design 

•  Focused  algorithms  (when  given  source  and  destination) 

-  A*  is  a  version  of  Dijkstra’s  algorithm  that  ignores  some  nodes 
which  cannot  be  on  the  optimal  path 

-  FMM  updates  depend  on  neighboring  simplices  rather  than 
individual  nodes,  so  there  is  no  straightforward  adaptation  of  A* 

•  Non-holonomic 

-  The  value  function  may  not  be  continuous  if  some  directions  of 
motion  are  forbidden 

-  Without  continuity  on  a  simplex,  interpolation  should  not  be  used  in 
the  local  updates 
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Outline 


Cost  Map 


Introduction 

-  Optimal  control 

-  Dynamic  programming  (DP) 

Path  Planning 

-  Discrete  planning  as  optimal  control 

-  Dijkstra’s  algorithm  &  its  problems 

-  Continuous  DP  &  the  Hamilton- 
Jacobi  (HJ)  PDE 

-  The  fast  marching  method  (FMM): 
Dijkstra’s  for  continuous  spaces 

Algorithms  for  Static  HJ  PDEs 

-  Four  alternatives 
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-  FMM  pros  &  cons 
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-  Alternative  action  norms 

-  Multiple  objective  planning 
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Why  the  Euclidean  Norm? 

•  We  have  thus  far  assumed  11-112  bound,  but  it  is  not  always  best 

•  For  example:  robot  arm  with  joint  angle  state  space 

-  All  joints  may  move  independently  at  maximum  speed:  H-H^ 

-  Total  power  drawn  by  all  joints  is  bounded:  H-H^ 

•  If  action  is  bounded  in  H-Hp,  then  value  function  is  solution  of 
“Eikonal”  equation  P(a;)||p.  =  c(x)  in  the  dual  norm  p* 

-  p  =  1  and  p  =  00  are  duals,  and  p  =  2  is  its  own  dual 

•  Straightforward  to  derive  update  equations  for  p  =  1 ,  p  =  oo 


Alton  &  Mitchell 
ICRA  2006 
and 

accepted  to 
SINUM  2008 


State  space 
X  e[0,  2k)^ 
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Update  Formulas  for  Other  Norms 

•  Straightforward  to  derive  update  equations  for  p  =  1 ,  p  =  oo 


||Vi?(a;o)||i  =  ^  (|i9o  -  "i^ll  +  l-i^O  -  '1^21) 
'i?olp*=l  =  i  (A.xc(xo)  +  191+  ^2) 


l|Vi9(a;o)||oo  =  ^^maxdi^o  -'<9i|,|i9o  -^2!) 
i9o|p*=oo  =  Axc(xo)  +  min  (i9i,i92) 


Infinity  Norm 


Action  bound  p  =  oo,  so 
update  formula  p*  =  1 

Right:  optimal  trajectory  of 
two  joint  arm  under  11-112  (red) 
and  IML  (biue) 

Beiow:  one  joint  and  slider 
arm  under  H-H^ 


Mixtures  of  Norms:  Multiple  Vehicles 

•  May  even  be  situations  where  action  norm  bounds  are  mixed 

-  Red  robot  starts  on  right,  may  move  any  direction  in  2D 

-  Blue  robot  starts  on  left,  constrained  to  1 D  circular  path 

-  Cost  encodes  black  obstacles  and  collision  states 

-  2D  robot  action  constrained  in  IML  and  combined  action  in  IMI 

Wild  I  I  I  loo 


Mixtures  of  Norms:  Multiple  Vehicles 

•  Now  consider  two  robots  free  to  move  in  the  plane 


f 

/^d'i9(x)  di9(x)\ 

fd'd{x) 

1 

V  ’  dx2  ) 

2  ’  V  <9^3  ’  dx/\.  ) 

2 

Constrained  Path  Planning 


•  Input  includes  multiple  cost  functions  c^(x)  Mitchell  &  Sastry, 

“Continuous  Path  Planning 

Possible  goals.  with  Multiple  Constraints,” 

-  Find  feasible  paths  given  bounds  on  each  cost  CDC  2003 

-  Optimize  one  cost  subject  to  bounds  on  the  others 

-  Given  a  feasible/optimal  path,  determine  marginals  of  the 
constraining  costs 


Variable  cost  (eg  threat  level) 


Threat  Cost  Map 


0  0 


Constant  cost  (eg  fuel) 


Fuel  Cost  Map 
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Path  Integrals 


•  To  determine  if  path  p{t)  is  feasible,  we  must  determine 


target 


Jo  [p(T)  =  X 

•  If  the  path  is  generated  from  a  value  function  »?(x),  then  path 
integrals  can  be  computed  by  solving  the  PDE 

DxPi(x)  ■  Dx'&{x)  —  Ci(x)c(x) 

•  The  computation  of  the  Pj(x)  can  be  integrated  into  the  FMM 
algorithm  that  computes  t?(x) 
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Pareto  Optimality 

•  Consider  a  single  point  x  and  a  set  of  costs  c^{x) 

•  Path  is  unambiguously  better  than  path  if 

Pi{x\Pm)  <  Piix'.Pn)  for  all  i 

•  Pareto  optimal  surface  Is  the  set  of  all  paths  for  which  there  are 
no  other  paths  that  are  unambiguously  better 
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Exploring  the  Pareto  Surface 

•  Compute  value  function  for  a  convex  combination  of  cost 
functions 

-  For  example,  let  c{x)  =  Xc^{x)  +  (1  -  'k)c^{x),  A,  e  [  0,1  ] 

•  Use  FMM  to  compute  corresponding  '&{x)  and  P^{x) 

•  Constructs  a  convex  approximation  of  the  Pareto  surface  for 
each  point  x  in  the  state  space 
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Constrained  Path  Planning  Example 

•  Plan  a  path  across  Squaraguay 

-  From  Lowerleftville  to  Upper  Right  City 

-  Costs  are  fuel  (constant)  and  threat  of  a  storm 


Weather  cost  (two  views) 
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Weather  and  Fuel  Constrained  Paths 


line  type 

minimize 

what? 

fuel 

constraint 

fuel 

cost 

weather 

cost 

fuel 

none 

1.14 

8.81 

weather 

1.3 

1.27 

4.55 

weather 

1.6 

1.58 

3.03 

weather 

none 

2.69 

2.71 
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-  Pareto  approx 

*  100  samples 

*  201  samples 

*  401  samples 


Pareto  Optimal  Approximation 


2.8 


2.6 


•  Cost  depends  linearly  on  number  of  sample  X  values 

-  For  2012  grid  401  x  samples,  execution  time  53  seconds 


2.4 


2.2 


For  Destination  =  [  0.9,  0.9  ] 


5  6 

Weather  Cost 
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More  Constraints 

•  Plan  a  path  across  Squaraguay 

-  From  Lowerleftville  to  Upper  Right  City 

-  There  are  no  weather  stations  in  northwest  Squaraguay 

-  Third  cost  function  is  uncertainty  in  weather 


Uncertainty  cost  (two  views) 
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Three  Costs 


line 

type 

minimize 

what? 

fuel 

constraint 

weather 

constraint 

fuel 

cost 

weather 

cost 

uncertainty 

cost 

fuel 

none 

none 

1.14 

8.81 

1.50 

weather 

none 

none 

2.69 

2.71 

5.83 

uncertainty 

none 
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1.17 

8.41 

1.17 

weather 

1.6 
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1.60 

3.02 

2.84 

weather 

1.3 
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1.30 

4.42 

2.58 

uncertainty 

1.3 

6.0 

1.23 

5.84 

1.23 
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Pareto  Surface  Approximation 

•  Cost  depends  linearly  on  number  of  sample  X  values 

-  For  201 2  grid  and  1 01  ^  A,  samples,  execution  time  1 3  minutes 

For  Destination  =  [  0.9,  0.9  ] 
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Fuel  Cost 


Three  Dimensions 


line  type 

minimize 

what? 

fuel 

constraint 

fuel 

cost 

weather 

cost 

fuel 

none 

1.14 

3.54 

— 

weather 

none 

1.64 

1.64 

weather 

1.55 

1.55 

2.00 

Constrained  Example 


•  Plan  path  to  selected  sites 

-  Threat  cost  function  is  maximum  of  individual  threats 


•  For  each  target,  plan  3  paths 

-  minimum  threat,  minimum  fuel,  minimum  threat  (with  fuel  ^  300) 


Threat  Cost  Map 


>. 


0  50  100  150  200  250  300  350  400 

X 


threat  cost 


X  (km) 


Paths  (on  value  function) 
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Future  Work 

•  Fast  Sweeping  and  Marching  code 

-  Python  &  C++ 

-  Interfaced  to  time-dependent  HJ  Toolbox  and  Matlab 

•  Robotic  applications 

-  Mesh  refinement  strategies 

-  Integration  with  locaiization  algorithms 

-  Practical  implementation 

•  Higher  dimensions? 

-  Taking  advantage  of  special  structure 

-  Integration  with  suboptimai  but  scaiable  techniques 
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Not  Discussed 

•  Time  dependent  HJ  PDEs 

-  Toolbox  of  Level  Set  Methods 

•  Reach  sets 

-  Safe  control  synthesis 

-  Abstraction  for  verification 

•  Particle  level  sets 

-  Improving  volume  conservation 
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